Abstract

Consider discrete values of functions shifted by unobserved translation effects,
which are independent realizations of a random variable with unknown distribution
modeling the variability in the response of each individual. Our aim is to
construct a nonparametric estimator of the density of these random translation
deformations using semiparametric preliminary estimates of the shifts. Building on
results of Dalalyan et al. (2006), semiparametric estimators are obtained in our
discrete framework and their performance studied. From these estimates we construct
a nonparametric estimator of the target density. Both rates of convergence and an
algorithm to construct the estimator are provided.

Keywords: Semiparametric statistics, Order two properties, Penalized Maximum Likelihood,
Practical algorithms.

1 Introduction

Our aim is to estimate the common density of independent random variables $\theta_j$,
$j = 1, \dots, J_n$, with distribution $\mu$, observed in a panel data analysis framework in a
translation model. More precisely, consider $J_n$ unknown curves $t \mapsto f^{[j]}(t)$ sampled at
multiple points $t_{ij} = t_i = i/n$, $i = 1, \dots, n$, with random i.i.d. translation effects
$\theta_j$, $j = 1, \dots, J_n$, in the following regression framework

$$Y_{ij} = f^{[j]}(t_{ij} - \theta_j) + \sigma \varepsilon_{ij}, \quad i = 1, \dots, n, \quad j = 1, \dots, J_n, \tag{1}$$

where the $\varepsilon_{ij}$ are independent standard normal $\mathcal{N}(0, 1)$ random noise variables,
independent of the $\theta_j$'s, while $\sigma$ is a positive real number which is assumed to be
known. The number of points per curve is denoted by $n$ while $J_n$ stands for the number of curves.
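
As an illustration of the sampling scheme in model (1), the following is a minimal simulation sketch. The common curve `f`, the uniform law of the shifts and the numerical values are hypothetical choices made only for this example; they are not taken from the paper.

```python
import numpy as np

def simulate_curves(f, thetas, n, sigma, rng):
    """Draw Y[i, j] = f(t_i - theta_j) + sigma * eps_ij on the grid t_i = i/n."""
    t = np.arange(1, n + 1) / n                    # design points t_i = i/n
    signal = f(t[:, None] - thetas[None, :])       # shape (n, J): one column per curve
    noise = rng.standard_normal(signal.shape)      # i.i.d. N(0, 1) errors
    return t, signal + sigma * noise

rng = np.random.default_rng(0)
f = lambda u: np.cos(2 * np.pi * u)                # hypothetical common curve
thetas = rng.uniform(-0.2, 0.2, size=50)           # hypothetical law of the shifts
t, Y = simulate_curves(f, thetas, n=200, sigma=0.5, rng=rng)
```

Each column of `Y` is one observed curve; the shifts in `thetas` are the unobserved quantities whose law is the target of estimation.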

Equation (1) describes the situation often encountered in biology, data mining or
econometrics (see e.g. [20] or [6]) where the outcome of an experiment depends on a
random variable $\theta$ which models the case where the data variations take into account the
variability of each individual: each subject $j$ can react in a different way within a mean
behaviour, with slight variations given by the unknown curves $f^{[j]}$. Estimating $\varphi$, the
density of the unobserved $\theta_j$'s, makes it possible to understand this mean behaviour.

Nonparametric estimation of $\varphi$ belongs to the class of inverse problems for which the
subject of the inversion is a probability measure, since the realizations $\theta_j$ are warped
by unknown functions $f^{[j]}$. Since these functions are unknown, the underlying
inverse problem becomes all the more difficult, as sharp approximations of the $\theta_j$'s are
needed to prevent flawed rates of convergence for the density estimator. While the estimation
of parameters, observed through their image by an operator, traditionally relies on the
inversion of the operator, here the repetition of the observations makes it possible to use
recent advances in semiparametric estimation to improve the usual strategies developed to solve
such a problem.

Note that estimation of such warping parameters has been investigated by several
authors using nonparametric methods for very general models, see for instance [121 [E] , or 
[21]. However little attention is paid to the law of these random parameters. Moreover, 
as said previously, sharp estimates of the parameters are required to achieve density 
estimation, which requires semiparametric methods. 

Our approach consists, first, in the estimation of the shifts $\theta_j$ while the functions
$f^{[j]}$ play the role of nuisance parameters. We follow the semiparametric approach introduced
in [7] in the Gaussian white noise framework and extend it to the discrete regression
framework. This provides sharp estimators of the unobserved shifts, up to order 2
expansions. Alternative methods can be found in [11] or [23]. These preliminary estimates
make it possible, in a second step, to recover the unknown density $\varphi$ of the $\theta_j$'s as if
the shifts were directly observed, at least if $J_n$ is not significantly larger than $n$. This
paper also provides a practical algorithm for both the semiparametric and the nonparametric
steps. The first step is the most difficult one: to build practicable semiparametric estimators,
we propose an algorithm which refines the one proposed in [15] for the period model and
relies on the previously obtained second order expansion.

Beyond the shift estimation case, which involves a symmetry assumption on $f^{[j]}$, our
procedure may be applied to semiparametric models where an explicit penalized profile
likelihood is available and well-behaved estimators of the $\theta_j$'s can be obtained. A
particularly important example in applications is the estimation of the period of an unknown
periodic function, see for instance [15]. Given a sequence of $J_n$ experiments like the one
considered in [15], one might be interested in estimating the law of the corresponding
periods of the signals. In this case one can also consider applying our method, under
some conditions made explicit in the sequel.

The paper is organized as follows. In Section 2, semiparametric estimators $\hat\theta_j$
of the realizations $\theta_j$ of the shift parameters are proposed, and sharp bounds between
$\hat\theta_j$ and $\theta_j$ are provided. Then, in Section 3, a nonparametric estimator of the
unknown distribution is considered, and rates of convergence are provided in the case where
$\mu$ admits a density, in the general model (1) under the condition that the $\theta_j$'s can be
sufficiently well approximated. In Section 4, the practical estimation problem is considered
and a simulation study is conducted. Technical proofs are gathered in Section 5.

2 Semiparametric Estimation of the shifts 

In this Section, we provide, for each fixed $j$, semiparametric estimators of the $j$-th
realization $\theta_j$ of the random variable $\theta$, observed in Model (1). To build these
estimates, we follow the method introduced by Dalalyan, Golubev and Tsybakov in [7] for a
continuous-time version of the translation model. We obtain analogues of two of their results
in our discrete-time model: a deviation estimate stated in Lemma 2.3 and a second order
expansion for the estimators stated in Lemma 2.4. We also establish a new result in Lemma 2.5,
which makes it possible to control the bias of the estimates. The particular form of the
estimators and the second order expansion will not be used to build the density estimator in
Section 3.

Hence, conditionally on the event $\theta_j = \theta$, we construct an estimator $\hat\theta_j$ and
establish asymptotic results for the conditional distribution $(\hat\theta_j \mid \theta_j = \theta)$,
gathered in Lemmas 2.3, 2.4 and 2.5. In the remainder of this Section, since $j$ is fixed, the
index $j$ in the notation is dropped (for instance $Y_{ij}$ is simply denoted by $Y_i$). We shall
denote by $\|\cdot\|$ the $L^2$-norm on $[0, 1]$ and by $\|\cdot\|_\infty$ the $L^\infty$-norm on $\mathbb{R}$.

2.1 Shift estimation in the discrete time translation model 

The model reduces to, assuming for simplicity that $\sigma = 1$,

$$Y_i = f(t_i - \theta) + \varepsilon_i, \quad i = 1, \dots, n, \tag{2}$$

where $f$ is a symmetric function satisfying some additional assumptions detailed below
and $t_i = i/n$. The corresponding problem is the one of semiparametric estimation of the
center of symmetry in a discrete framework.

Working assumptions in the translation model 

We assume that the support $\Theta$ of the distribution $\mu$ of the random variable $\theta$ is
compact and contained in an interval of diameter upper-bounded by 1/2:

(A1) $\Theta = \{\theta, \ |\theta| \le \tau_0\}$, where $\tau_0$ is such that $0 < \tau_0 < 1/4$.

The function $f$ is assumed to be symmetric (that is, $f(x) = f(-x)$ for all real $x$) and
periodic with period 1, with Fourier coefficients denoted by $f_k$, $k \ge 1$,

(A2) $f(t) = \sqrt{2} \sum_{k \ge 1} f_k \cos(2\pi k t)$, where $f_k = \sqrt{2} \int_0^1 f(t) \cos(2\pi k t)\,dt$.

Let $C^2(\mathbb{R})$ denote the set of all twice continuously differentiable functions on
$\mathbb{R}$. We assume that there exist $\rho > 0$ and $C_0 < +\infty$ such that $f$ belongs to the set
$\mathcal{F}$ defined by

(A3) $\mathcal{F} = \mathcal{F}(\rho, C_0) = \{ f \in C^2(\mathbb{R}), \ f_1^2 \ge \rho, \ \|f''\|^2 \le C_0 \}$.



Conditions (A1)-(A3) can be seen as working assumptions and are essentially the same
as in [7]. Assuming periodicity of $f$ is not a drawback since, in practice, the function $f$
is compactly supported and can easily be periodized. The assumption that $f$ belongs
to $C^2(\mathbb{R})$ is handy in particular for proving the second order properties of the estimator.
Note also that for simplicity in the definitions of the classes, as in [7] we have assumed
that the Fourier coefficient for $k = 0$, that is $\int_0^1 f(u)\,du$, is zero.

Identifiability in model (2) follows from the symmetry and 1-periodicity of the functions (note
that assuming $f_1^2 \ge \rho$ implies that $f$ cannot be periodic with a smaller period) and the
fact that the diameter of $\Theta$ is less than 1/2.

Note also that within this framework, the Fisher information for estimating $\theta_j$ for a
fixed $j$ is, as $n$ tends to $+\infty$, given by $\{1 + o(1)\}\, n \|f'\|^2$.

Construction of the estimator



Let us define an estimator $\hat\theta$ of the shift $\theta$ in model (2) by

$$\hat\theta = \arg\max_{\tau \in \Theta} \sum_{k \ge 1} h_k \left( \frac{1}{n} \sum_{i=1}^{n} \cos(2\pi k (t_i - \tau))\, Y_i \right)^2, \tag{3}$$

where $(h_k)$ is a sequence of real numbers in $[0, 1]$ satisfying some conditions made precise
in the following subsection. The sequence $(h_k)$ is called sequence of weights or filter.
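
The criterion (3) can be evaluated on a grid of candidate shifts. The sketch below is one possible implementation under stated assumptions: the filter is passed as a finite array, the maximization is a plain grid search, and the test curve, noise level and grid are hypothetical values chosen for illustration.

```python
import numpy as np

def shift_estimate(y, weights, tau_grid):
    """Grid maximiser of sum_k h_k * ((1/n) sum_i cos(2 pi k (t_i - tau)) y_i)^2."""
    n = len(y)
    t = np.arange(1, n + 1) / n
    k = np.arange(1, len(weights) + 1)
    # Empirical cosine and sine coefficients of the data (without the sqrt(2) factor).
    a = np.cos(2 * np.pi * np.outer(k, t)) @ y / n
    b = np.sin(2 * np.pi * np.outer(k, t)) @ y / n
    # cos(2 pi k (t - tau)) = cos(2 pi k t) cos(2 pi k tau) + sin(2 pi k t) sin(2 pi k tau)
    ang = 2 * np.pi * np.outer(tau_grid, k)
    proj = np.cos(ang) * a + np.sin(ang) * b       # shape (len(tau_grid), K)
    crit = (proj ** 2) @ weights                   # criterion value for each tau
    return tau_grid[np.argmax(crit)]

rng = np.random.default_rng(1)
n, theta = 400, 0.1
t = np.arange(1, n + 1) / n
f = lambda u: np.cos(2 * np.pi * u) + 0.5 * np.cos(4 * np.pi * u)  # hypothetical symmetric curve
y = f(t - theta) + 0.3 * rng.standard_normal(n)
tau_grid = np.linspace(-0.24, 0.24, 481)           # candidate shifts inside (-1/4, 1/4)
theta_hat = shift_estimate(y, np.ones(2), tau_grid)
```

Computing the cosine and sine coefficients once and recombining them for each candidate shift avoids recomputing the inner sum over the design points for every grid value.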

The estimator $\hat\theta$ is similar to the estimator $\hat\theta_{PML}$ proposed in [7]: here the
integral in their definition is replaced by the equivalent discrete sum in the discrete-time
model. As we sketch below, the estimator arises in a natural way by using a penalized profile
likelihood method as in [7], though here in an approximate way only.

First one turns the study of the regression model into the study of a sequence of
independent submodels. Let us introduce, for any $k \ge 1$,

$$x_k = \frac{1}{n} \sum_{i=1}^{n} \sqrt{2} \cos(2\pi k t_i)\, Y_i, \qquad \xi_k = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \sqrt{2} \cos(2\pi k t_i)\, \varepsilon_i,$$

$$x_k^* = \frac{1}{n} \sum_{i=1}^{n} \sqrt{2} \sin(2\pi k t_i)\, Y_i, \qquad \zeta_k = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \sqrt{2} \sin(2\pi k t_i)\, \varepsilon_i.$$

Note that $x_k$ and $x_k^*$ are observed. Using the fact that $Y_i$ follows (2),

$$x_k = \cos(2\pi k \theta) f_k + d_{k,n} + \frac{1}{\sqrt{n}} \xi_k, \tag{4}$$

$$x_k^* = \sin(2\pi k \theta) f_k + d_{k,n}^* + \frac{1}{\sqrt{n}} \zeta_k, \tag{5}$$

where $d_{k,n}$, $d_{k,n}^*$ are the differences between the Fourier coefficient and its
approximation:

$$d_{k,n} = \sqrt{2} \left( \frac{1}{n} \sum_{i=1}^{n} \cos(2\pi k t_i) f(t_i - \theta) - \int_0^1 \cos(2\pi k t) f(t - \theta)\,dt \right).$$

The term $d_{k,n}^*$ is obtained in a similar way, replacing the cosine by a sine. We would like
to underline two important facts about the previous quantities. First, since the $\varepsilon_i$'s are
Gaussian $\mathcal{N}(0, 1)$, the variables $(\xi_k, \zeta_k)_{k \ge 1}$ are also Gaussian and, since we
assume that $t_i = i/n$, using the orthogonality of the trigonometric basis over this system of
points, they are in fact independent standard Normal. Second, it is important to note that both
$d_{k,n}$ and $d_{k,n}^*$ are non-random and bounded uniformly in $\theta$. We will use more precise
bounds as functions of $k$ and $n$ in the proofs, see Lemma 5.1 in Section 5.
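
The orthogonality of the trigonometric system over the design points $t_i = i/n$ can be checked numerically. The sketch below builds the linear map sending the noise vector to the cosine and sine noise coefficients and computes its Gram matrix. The restriction to frequencies $1 \le k < n/2$ and the value of `n` are choices made for this illustration.

```python
import numpy as np

n = 64
t = np.arange(1, n + 1) / n
k = np.arange(1, n // 2)                       # frequencies 1 <= k < n/2
C = np.sqrt(2) * np.cos(2 * np.pi * np.outer(k, t))
S = np.sqrt(2) * np.sin(2 * np.pi * np.outer(k, t))
B = np.vstack([C, S]) / np.sqrt(n)             # rows map the noise vector to the coefficients
gram = B @ B.T                                 # covariance matrix of the coefficients
```

For these frequencies the Gram matrix is the identity up to rounding, so the Gaussian coefficients are uncorrelated with unit variance, hence independent standard normal.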

The penalized profile likelihood method is as follows. For each integer $k$ and $\tau \in \Theta$,
let us define the quantity $p_\tau(x_k, x_k^*, f_k)$ as

$$\frac{n}{2\pi} \exp\left( -\frac{n}{2} \left( x_k - \cos(2\pi k \tau) f_k - d_{k,n} \right)^2 - \frac{n}{2} \left( x_k^* - \sin(2\pi k \tau) f_k - d_{k,n}^* \right)^2 - \frac{f_k^2}{2\sigma_k^2} \right).$$

This is the usual likelihood corresponding to the observation $(x_k, x_k^*)$ with an additional
penalization term $-f_k^2 / (2\sigma_k^2)$, where $\sigma_k^2$ has to be chosen. The profile likelihood
technique (see [22, Chap. 25]) consists in "profiling out" the nuisance parameter $f_k$ by setting

$$\hat f_k(\tau) = \arg\max_{f_k} \ p_\tau(x_k, x_k^*, f_k)$$

and then defining

$$\hat\theta_{PML} = \arg\max_{\tau} \prod_{k \ge 1} p_\tau(x_k, x_k^*, \hat f_k(\tau)).$$

Here a difficulty is that $d_{k,n}$ and $d_{k,n}^*$ depend on $f_k$. However, if we neglect those
terms, we can follow the calculations made in [7] and we obtain that $\hat\theta_{PML}$ is the
maximizer of

$$\sum_{k \ge 1} h_k \left( \cos(2\pi k \tau)\, x_k + \sin(2\pi k \tau)\, x_k^* \right)^2,$$

which coincides with the criterion in (3), up to a constant factor, if we set
$h_k = \sigma_k^2 / (\sigma_k^2 + n^{-1})$, since
$\frac{1}{n} \sum_{i=1}^{n} \cos(2\pi k (t_i - \tau)) Y_i = (\cos(2\pi k \tau) x_k + \sin(2\pi k \tau) x_k^*)/\sqrt{2}$.
Thus the criterion (3) can be obtained by an approximate profile likelihood method. Yet, it is
not trivial to see at this point whether having neglected the terms $d_{k,n}$, $d_{k,n}^*$ in the
construction of the estimator can have a negative influence on the behavior of the criterion.
In fact, we will see in the sequel that this is not the case, and that (3) can lead to a very
good estimator, even at second order, provided a sensible choice of $(h_k)$ is made.
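
The profiling step can be worked out explicitly. The derivation below is a sketch under two assumptions: the discretization terms $d_{k,n}$, $d_{k,n}^*$ are neglected, and the penalized likelihood has the Gaussian form with variance $1/n$ per coordinate and penalty $-f_k^2/(2\sigma_k^2)$. The shorthand $A_k(\tau)$ is introduced here only for this computation.

```latex
% Sketch: d_{k,n} and d^*_{k,n} are neglected; A_k(\tau) is a shorthand introduced here.
\begin{align*}
A_k(\tau) &:= \cos(2\pi k\tau)\,x_k + \sin(2\pi k\tau)\,x_k^*, \\
\log p_\tau(x_k, x_k^*, f_k) &= \mathrm{const} - \tfrac{n}{2}\bigl(x_k^2 + (x_k^*)^2\bigr)
   + n f_k A_k(\tau) - \tfrac{1}{2}\bigl(n + \sigma_k^{-2}\bigr) f_k^2, \\
\hat f_k(\tau) &= \frac{n}{n + \sigma_k^{-2}}\,A_k(\tau) = h_k A_k(\tau),
   \qquad h_k = \frac{\sigma_k^2}{\sigma_k^2 + n^{-1}}, \\
\log p_\tau\bigl(x_k, x_k^*, \hat f_k(\tau)\bigr) &= \mathrm{const}
   - \tfrac{n}{2}\bigl(x_k^2 + (x_k^*)^2\bigr) + \tfrac{n}{2}\,h_k A_k(\tau)^2 .
\end{align*}
```

Only the last term depends on $\tau$, so maximizing the product of the profiled likelihoods over $\tau$ amounts to maximizing $\sum_{k \ge 1} h_k A_k(\tau)^2$.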

2.2 Asymptotic behavior of the shifts estimators 

First let us specify some technical assumptions we make on the sequence of weights
$(h_k)$ in the definition (3) of the estimator. These are the same as Assumptions B and C
in [7], except that here we also restrict ourselves to a finite number of nonzero weights.

Assumptions on the sequence of weights $(h_k)$

Let the sequence $(h_k)$ be such that $h_1 = 1$, $0 \le h_k \le 1$ for all $k \ge 1$ and assume that
there are positive constants $D_1$ and $\rho_1$ such that

(C1) The number of weights such that $h_k \ne 0$ is finite.

1/2 



(C2) 



.k>l 



{2nk)'hl 



^ pi(log^n) max(27r/c)/ifc. 



k>l 



(C3) J2^k{2TTk)^ ^ Dm. 
(T) (j2(^-hk){2irk)'f!] =o(^(l-M^(2vrA:)V,^ 



The first condition is quite natural to make the estimator feasible in practice. Conditions
(C2) and (C3) specify the range of the sequence $(h_k)$. Condition (T) makes it possible to
obtain second order properties, see the proof of Lemma 2.4.

Remark 2.1. As noted in [7], conditions (C1), (C2), (C3) and (T) are fulfilled for a quite
wide range of weights. For instance, the sequences $(h_k = \mathbb{1}_{k \le N(T)})$, also called projection
weights, satisfy the preceding conditions since (C2) and (C3) are satisfied respectively for 
N{T) ^ Clog^n and N{T) ^ Cn^^^, while condition (T) is always satisfied for projection 
weights since $\sum_{k > N(T)} (2\pi k)^4 f_k^2 \to 0$, as $n \to +\infty$, due to (A3).

Remark 2.2. It is also possible to consider random, data-driven weights. This approach
is considered in [8].

Asymptotic properties 

For ease of reference in Section 3, it is convenient here to make the dependence on $j$
explicit again. In the following Lemmas, $f^{[j]\prime}$ and $f_k^{[j]}$ respectively denote the
derivative and the Fourier coefficients of $f^{[j]}$. Lemmas 2.3 and 2.4 are respectively
analogues of Lemma 5 and Theorem 1 in [7], here in a discrete regression framework, which seems
to be closer to practical applications. Though it seems natural that the cited results extend
to our context, it is not obvious that the extra terms induced by the discretization of
the model, for instance the $d_{k,n}$'s introduced above, do not interfere with the rates, in
particular at the second order. But we prove that they eventually do not, see the proof
of the Lemmas in Section 5.

Lemma 2.3 (Deviation bound). Assume that (A), (C), (T) are fulfilled. For any $K > 0$
and any positive integer $n$, denote $x_n = K \sqrt{\log n}$. There exist positive constants
$c_1$, $c_2$ such that for any $K > 0$, for $n$ large enough, uniformly in $j \in \{1, \dots, J_n\}$,
$\theta_j \in \Theta$ and $f^{[j]} \in \mathcal{F}$, it holds

$$P\left( \sqrt{n}\, |\hat\theta_j - \theta_j| > x_n \mid \theta_j \right) \le c_1 \exp(-c_2 x_n^2). \tag{6}$$

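Lemma 2.4 (Second order Expansion). Assume that (A), (C), (T) are fulfilled. Let us
denote $R_n[h, f^{[j]}] = \sum_{k \ge 1} (2\pi k)^2 \left[ (1 - h_k)^2 (f_k^{[j]})^2 + h_k^2 / n \right]$.
Uniformly in $j \in \{1, \dots, J_n\}$, $\theta_j \in \Theta$ and $f^{[j]} \in \mathcal{F}$, as $n$ tends to $+\infty$,

$$E\left[ (\hat\theta_j - \theta_j)^2 \mid \theta_j \right] = \frac{1}{n \|f^{[j]\prime}\|^2} + (1 + o(1)) \frac{R_n[h, f^{[j]}]}{n \|f^{[j]\prime}\|^4}. \tag{7}$$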

Lemma 2.4 implies that, conditionally on $\theta_j$, the estimator $\hat\theta_j$ is efficient for
estimating $\theta_j$ at the first order. It also provides an explicit form for the second order
term of the quadratic risk. The explicit expression of this term is not needed to establish the
convergence rate of the plug-in estimator in Section 3. Nevertheless it justifies the choice of
the filter made in Section 4. Indeed, we see from (7) that an appropriate filter $(h_k)$ is a
filter such that $R_n[h, f^{[j]}]$ is as small as possible.

The following Lemma 2.5 is new with respect to [7]. It ensures that the conditional
law of $\hat\theta_j$ is centered at $\theta_j$, up to a $O(\log n / n)$ term.
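
To see how the filter drives the second order term, one can evaluate $R_n[h, f] = \sum_{k \ge 1} (2\pi k)^2 [(1 - h_k)^2 f_k^2 + h_k^2/n]$ for projection weights of increasing length. The Fourier coefficients below are a hypothetical polynomially decaying sequence truncated at 200 terms, used only to illustrate the trade-off.

```python
import numpy as np

def second_order_term(h, f_coef, n):
    """R_n[h, f] = sum_k (2 pi k)^2 * ((1 - h_k)^2 * f_k^2 + h_k^2 / n)."""
    k = np.arange(1, len(h) + 1)
    return np.sum((2 * np.pi * k) ** 2 * ((1 - h) ** 2 * f_coef ** 2 + h ** 2 / n))

n = 1000
k = np.arange(1, 201)
f_coef = k ** -3.0                               # hypothetical Fourier coefficients
# Projection weights h_k = 1 if k <= N, else 0, for a few cut-offs N.
risks = {N: second_order_term((k <= N).astype(float), f_coef, n) for N in (1, 5, 20, 200)}
```

A short filter leaves a large approximation term from the discarded coefficients, while a long filter accumulates the noise term $h_k^2/n$; an intermediate length gives the smallest value in this example.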

Lemma 2.5 (Asymptotical Bias). Assume (A), (C), (T) and that there exists a positive
constant $D$ such that for any $f \in \mathcal{F}$, it holds $\sum_{k \ge 1} k^2 |f_k| \le D$. Then,
uniformly in $j \in \{1, \dots, J_n\}$, as $n$ tends to $+\infty$,

$$E\left[ (\hat\theta_j - \theta_j) \mid \theta_j \right] = O\left( \frac{\log n}{n} \right). \tag{8}$$

This Lemma requires only slightly more regularity on $f$ than a second derivative
bounded in $L^2$, which is what condition (A3) imposes, that is
$\sum_{k \ge 1} (2\pi k)^4 f_k^2 \le C_0$. It enables us to have a slightly broader framework for our
results in Section 3. However, one can still obtain interesting results without using this
Lemma, see Remark 3.5 after Theorem 3.3.

2.3 Case of the period model

Let us now consider the period model mentioned in the introduction, where symmetry
of the functions is not assumed. The random variables $\theta_j$ arise this time as periods of
periodic functions. The observations in a fixed and equally spaced design are

$$Y_{ij} = f^{[j]}\left( t_i / \theta_j \right) + \varepsilon_{ij}, \quad i = -n/2, \dots, n/2, \quad j = 1, \dots, J_n, \tag{9}$$

where the $\theta_j$'s belong to a compact interval in $]0, +\infty[$ and the 1-periodic functions
$f^{[j]}$ fulfill some smoothness assumptions, for instance the ones assumed in [1] in the
Gaussian white noise framework.

It is established in [1] that the penalized profile likelihood method yields estimators
satisfying, with appropriate rescaling, statements similar to (6) and (7), in the
continuous-time model, see [1, Lemma 11 and Theorem 1].

Hence one can apply the method of this paper for estimating the law of the $\theta_j$'s in
model (9), provided one can transpose the proofs of (6)-(7) for the continuous-time model
in terms of the discrete framework, as is done here in Section 5 for the shift model.

3 Nonparametric estimation of the distribution $\mu$

We are interested in the estimation of the distribution $\mu$ of the unobserved sample
$\theta_1, \dots, \theta_{J_n}$ in model (1). We shall assume that the number of curves $J_n$ tends to
$+\infty$. Our approach is based on the assumption that, along each curve, one can estimate in an
appropriate way the corresponding $\theta_j$.

More precisely, the realizations $\theta_1, \dots, \theta_{J_n}$ are unknown but we assume that they can
be approximated by some preliminary estimators $\hat\theta_{j,n}$, for $j = 1, \dots, J_n$, denoted for
simplicity $\hat\theta_j$ in the sequel.

Definition 3.1. We say that the random variables $\hat\theta_1, \dots, \hat\theta_{J_n}$ approximate the
sample $\theta_1, \dots, \theta_{J_n}$ if for each $j$, the variable $\hat\theta_j$ is built using the
observations $Y_{1j}, \dots, Y_{nj}$ (i.e. is measurable with respect to these observations) and
satisfies the deviation bound (6) given in Lemma 2.3.

Note in particular that with this definition, the random variables $\hat\theta_j$ are independent.
The fact that (6) holds roughly means that the $\theta_j$'s are approximated by the $\hat\theta_j$'s at
almost parametric rate with an exponential control of the deviation probability.

In Section 2, we have studied in detail a possible way of obtaining $\hat\theta_j$'s satisfying this
approximation property in model (1). However, we would like to point out that the results
of Section 3 hold as long as the $\hat\theta_j$'s are approximations of the $\theta_j$'s in model (1) in
the sense of Definition 3.1 (for Theorem 3.3 below, we shall also assume that (8) holds), which
possibly allows using estimators produced by other methods. Extensions to frameworks
beyond model (1) could also be considered.

3.1 A discrete estimator of $\mu$

A first way to define an estimator of $\mu$ is to consider a plug-in version of the usual
empirical distribution, defined using the preliminary estimates $\hat\theta_j$ as

$$\hat\mu_{J_n} = \frac{1}{J_n} \sum_{j=1}^{J_n} \delta_{\hat\theta_j}. \tag{10}$$

The empirical distribution computed with the conditional estimators of the shifts provides
a consistent approximation of the distribution of the true random shifts, in a weak sense.


Theorem 3.2 (Weak consistency of the plugged empirical measure). Assume (6), that $\Theta$
is compact and that there are positive finite constants $\alpha$ and $B$ such that
$J_n \le B n^{\alpha}$ and $J_n \to +\infty$. Then it holds

$$\hat\mu_{J_n} \longrightarrow \mu \quad \text{almost surely}, \tag{11}$$

which means that for all continuously differentiable compactly supported functions $g$,

$$\hat\mu_{J_n} g = \frac{1}{J_n} \sum_{j=1}^{J_n} g(\hat\theta_j) \longrightarrow \mu g = E(g(\theta)) \quad \text{almost surely}.$$
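Proof. For $g$ a continuously differentiable compactly supported function, we get that

$$\hat\mu_{J_n} g = \underbrace{\frac{1}{J_n} \sum_{j=1}^{J_n} \left( g(\hat\theta_j) - g(\theta_j) \right)}_{(I)} + \underbrace{\frac{1}{J_n} \sum_{j=1}^{J_n} g(\theta_j)}_{(II)}.$$

The law of large numbers ensures that, almost surely,

$$(II) \longrightarrow E(g(\theta)). \tag{12}$$

Now a Taylor upper bound leads to
$|(I)| \le \frac{1}{J_n} \sum_{j=1}^{J_n} \|g'\|_\infty |\hat\theta_j - \theta_j|$. If $\|g'\|_\infty = 0$,
then the previous quantity is equal to zero. Now consider the case $\|g'\|_\infty \ne 0$.
Using the prior bound and (6), we get for any $\lambda > 0$

$$P\left( |(I)| \ge \lambda \mid (\theta_j)_j \right) \le \sum_{j=1}^{J_n} P\left( |\hat\theta_j - \theta_j| \ge \frac{\lambda}{\|g'\|_\infty} \mid \theta_j \right) \le c_1 J_n \exp\left( -c_2 \frac{\lambda^2 n \|f^{[j]\prime}\|^2}{\|g'\|_\infty^2} \right),$$

which is uniform in $(\theta_j)_{j=1,\dots,J_n}$. Then, choosing $\lambda = c \sqrt{\log n / n}$ leads to the
following bound. For $c$ large enough, namely such that, for some $\eta > 0$,
$c^2 \ge (1 + \alpha + \eta) \|g'\|_\infty^2 / (c_2 \|f^{[j]\prime}\|^2)$, we can write

$$P\left( |(I)| \ge c \sqrt{\frac{\log n}{n}} \right) \le c_1 J_n n^{-(1 + \alpha + \eta)} \le c_1 B n^{-(1 + \eta)}.$$

Since the right-hand side is summable in $n$, Borel-Cantelli's Lemma enables us to conclude
that a.s.

$$(I) \longrightarrow 0. \tag{13}$$

Finally (12) and (13) prove the result. □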
Hence, we have constructed a discrete estimator of the law of the random shifts.
Nevertheless, in many cases this estimator is too rough when the law of the unknown
effect has a density with respect to Lebesgue measure. That is the reason why,
in the following, a density estimator is built, for which we provide functional rates of
convergence.

3.2 Estimation of the density of the random deformation 

Consider the following kernel estimator of $\varphi$, the density of the unobserved $\theta_j$'s in
model (1), based on a kernel $K$, to be specified in the following, and on the quantities
$\hat\theta_j$. For all $x$ in $\Theta$, let us define

$$\hat\varphi(x) = \frac{1}{J_n h_n} \sum_{j=1}^{J_n} K\left( \frac{\hat\theta_j - x}{h_n} \right). \tag{14}$$

In this subsection, we shall assume that the quantities $\hat\theta_j$ satisfy the approximation
property stated in Definition 3.1 and also the control on their expectation provided by (8).
We have checked in Section 2 that both properties are fulfilled under some regularity
conditions for the estimators $\hat\theta_j$ built in Section 2.1.
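
A plug-in kernel estimator of this type takes only a few lines once preliminary shift estimates are available. The sketch below is an illustration under stated assumptions: the Epanechnikov kernel is a convenient stand-in (a smoother density would call for a higher-order kernel), and `theta_hat` is a hypothetical sample standing in for the estimated shifts.

```python
import numpy as np

def plug_in_density(x, theta_hat, h, kernel=None):
    """Kernel estimate (1 / (J h)) * sum_j K((theta_hat_j - x) / h) at the points x."""
    if kernel is None:
        # Epanechnikov kernel: an illustrative choice, not the kernel of the paper.
        kernel = lambda u: 0.75 * np.clip(1 - u ** 2, 0, None)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    u = (theta_hat[None, :] - x[:, None]) / h      # shape (len(x), J)
    return kernel(u).sum(axis=1) / (len(theta_hat) * h)

rng = np.random.default_rng(2)
theta_hat = rng.uniform(-0.2, 0.2, size=2000)      # stand-in for estimated shifts
grid = np.arange(-0.5, 0.5, 0.001)
dens = plug_in_density(grid, theta_hat, h=0.05)
```

The estimate is nonnegative and integrates to one over the grid, up to discretization error, because the kernel used here is itself a probability density.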

Let us denote by $H_M(\beta, L)$ the set of all densities $\varphi$ with support included in the
interval $[-\tau_0, \tau_0] = \Theta$, which belong to the Hölder class $H(\beta, L)$ (see p. 5) and are
uniformly bounded by a positive constant $M$.

For clarity in the following statement, we shall assume that for $n$ large enough, either
$J_n \le (n / \log n)^{(2\beta+1)/(\beta+2)}$ or the converse inequality holds, for $\beta$ defined below
(otherwise use a subsequence argument).

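Theorem 3.3 (Rate of convergence of the nonparametric estimator). Let us assume that
$\varphi$ belongs to the class $H_M(\beta, L)$ for some positive $L$ and $M$ and with $\beta > 1$. Assume
moreover (6), (8) and that the kernel $K$ is smooth, compactly supported, of order
$\lfloor \beta \rfloor$. Then the kernel estimator $\hat\varphi$ defined by (14) achieves the following rates of
convergence, as $n$ and $J_n$ tend to $+\infty$,

$$\sup_{x \in \Theta} \ \sup_{\varphi \in H_M(\beta, L)} E\left( [\hat\varphi(x) - \varphi(x)]^2 \right) = \begin{cases} O\left( J_n^{-2\beta/(2\beta+1)} \right), & \text{if } J_n \le (n / \log n)^{(2\beta+1)/(\beta+2)}, \\ O\left( (n / \log n)^{-2\beta/(\beta+2)} \right), & \text{if } J_n \ge (n / \log n)^{(2\beta+1)/(\beta+2)}. \end{cases} \tag{15}$$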

Thus the classical rate of convergence $J_n^{-2\beta/(2\beta+1)}$ of density estimators over Hölder
classes $H(\beta, L)$, with $\beta > 1$, is maintained, provided the number of curves $J_n$ does not
exceed $(n / \log n)^{(2\beta+1)/(\beta+2)}$. In fact, it can be checked, using standard lower bound
techniques, that in model (1), the minimax rate of convergence of the pointwise mean-squared
risk for the estimation of $\varphi$ over the considered Hölder class is not faster than a constant
times $J_n^{-2\beta/(2\beta+1)}$, which yields the rate-optimality of the procedure in this model in the
first case of the Theorem. In the other case, the number of points per curve $n$ becomes the
limiting factor, and a slower rate specified by (15) is obtained. Whether this second rate is
optimal is an open question.
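
The two regimes can be made concrete with a small helper returning the bandwidth and the corresponding rate, with all multiplicative constants set to one. This is only a sketch: the bandwidths $J_n^{-1/(2\beta+1)}$ and $(\log n / n)^{1/(\beta+2)}$ are the ones used in the proof of the Theorem, and the function name is hypothetical.

```python
import math

def bandwidth(n, J, beta):
    """Bandwidth and rate in the two regimes of the theorem (constants set to 1)."""
    threshold = (n / math.log(n)) ** ((2 * beta + 1) / (beta + 2))
    if J <= threshold:
        # Few curves: the usual density-estimation trade-off in J.
        return J ** (-1 / (2 * beta + 1)), J ** (-2 * beta / (2 * beta + 1))
    # Many curves: the number of points per curve n is the limiting factor.
    h = (math.log(n) / n) ** (1 / (beta + 2))
    return h, (math.log(n) / n) ** (2 * beta / (beta + 2))

h, rate = bandwidth(1000, 100, 2.0)
```

The two branches agree at the threshold $J_n = (n / \log n)^{(2\beta+1)/(\beta+2)}$, so the returned rate is continuous in the number of curves.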

Our results can be interpreted as follows. The difficulty of the inverse problem is drastically
reduced when the number of observations per subject increases, which makes it possible, in a
way, to invert the convolution operator. A nonparametric estimation of the density of the
unobserved parameter in a regression framework can only be achieved if there are numerous
observations for each curve. In our case, the fact that asymptotics can be taken both in $J_n$
and in $n$ enables us to estimate first, for each curve, the random effect and then plug in the
values to estimate the density. If the number of observations per curve is small, as is usually
the case in pharmacokinetics, such techniques cannot be applied and we refer to [6] for
an alternative methodology.

Proof. For simplicity in the notation, we assume throughout the proof that the $\hat\theta_j$'s are
identically distributed (let us recall that they are independent), which enables us to just
deal with $j = 1$. If this is not the case, then one can still use the independence and then
bound the different quantities arising $j$ by $j$. Also we denote $h_n$ simply by $h$.
First note that the bias-variance decomposition is, for any $x$ in $\Theta$,

$$E\left( [\hat\varphi(x) - \varphi(x)]^2 \right) = \left( E[\hat\varphi(x)] - \varphi(x) \right)^2 + E\left( [\hat\varphi(x) - E(\hat\varphi(x))]^2 \right) = b(x)^2 + v(x).$$

Let us denote by $\Delta$ the quantity $\hat\theta_1 - \theta_1$. Note that, by definition of $\hat\theta_1$,
$\Delta$ is a measurable function of $(\theta_1, \{\varepsilon_{i1}\}_{i=1,\dots,n})$. In the sequel, we denote
$\Delta = g(\theta_1, \varepsilon)$.

Let us denote by $A_1 = \{ |\hat\theta_1 - \theta_1| \le D (n^{-1} \log n)^{1/2} \}$. Using (6), the
probability of its complement is negligible.

A Taylor expansion of the kernel K at the order k yields the existence of a random 
variable Z such that 



h \ h j h \ h h 

- iEA- f ^) (16) 



(IT) 



h \h \ h 



h \ h 

^BfCA-r^lV (20, 



h Kh'' \ h 



Note that, by the usual properties of a kernel of order L/3J, see e.g., jSU Theorem 1.1], for 
some positive constant C it holds 

$$\left| E\left[ \frac{1}{h} K\left( \frac{\theta_1 - x}{h} \right) \right] - \varphi(x) \right| \le C h^{\beta}.$$


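It is assumed that the $\hat\theta_j$'s satisfy (8), thus

$$|(17)| \le \frac{C \log n}{n h} \int \frac{1}{h} \left| K'\left( \frac{u - x}{h} \right) \right| \varphi(u)\,du = \frac{C \log n}{n h} \int |K'(v)|\, \varphi(x + vh)\,dv \le \frac{C \log n}{n h} \int |K'|.$$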



Sphtting (|T^ using Ai and its complement, 

, , C log n 

By the same argument, 



/i/n. 



p=3 



nh"^ 



nK^ 



as soon as \ogn/{nh'^) 0. Finally, 



, , CI log n\ C ( log n 



h \^ nK^ 



nh W-^h^k+^ 



1/2 



Thus 



b{xf ^ c 
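The variance term is bounded by

$$v(x) \le \frac{1}{J_n} E\left[ \frac{1}{h^2} K^2\left( \frac{\theta_1 - x + g(\theta_1, \varepsilon)}{h} \right) \right] = \frac{1}{J_n h} E\left[ \int \frac{1}{h} K^2\left( \frac{u - x + g(u, \varepsilon)}{h} \right) \varphi(u)\,du \right]$$

$$= \frac{1}{J_n h} E\left[ \int K^2\left( v + \frac{g(x + hv, \varepsilon)}{h} \right) \varphi(x + hv)\,dv \right] \le \frac{C}{J_n h} \|\varphi\|_\infty \|K\|_\infty. \tag{21}$$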



Finally, choosing k large enough in (1211) . we obtain, for any x in B, 



E ([^(x) - v{x)f) ^ c 



h^f' + 



1 log^n"' 

nh nh^ 



+ 



C 

Jn.h 



oo\\K\\oo 



(22) 



To obtain the rate of convergence of $\hat\varphi$, we distinguish two cases, depending on whether
the second or the third term in the preceding display is dominant.

• If $J_n \le (n / \log n)^{(2\beta+1)/(\beta+2)}$, then choosing $h_n = J_n^{-1/(2\beta+1)}$ implies that
$\left( \frac{\log n}{n h_n^2} \right)^2 \le \frac{1}{J_n h_n}$, leading to the rate

$$E\left( [\hat\varphi(x) - \varphi(x)]^2 \right) \le C J_n^{-2\beta/(2\beta+1)}.$$

• If $J_n \ge (n / \log n)^{(2\beta+1)/(\beta+2)}$, then choosing $h_n = (\log n / n)^{1/(\beta+2)}$ implies that
$\frac{1}{J_n h_n} \le \left( \frac{\log n}{n h_n^2} \right)^2$, leading to the rate

$$E\left( [\hat\varphi(x) - \varphi(x)]^2 \right) \le C \left( \frac{\log n}{n} \right)^{2\beta/(\beta+2)}.$$

Other choices of $h_n$ can easily be seen to lead to slower rates when optimizing (22).

□

Remark 3.4. Note that the difficulty of the proof lies in the fact that, a priori,
$\Delta = \hat\theta_1 - \theta_1$ and $\theta_1$ are not independent, see for instance the expression of the
shift estimator given by (3). Thus one cannot easily change variables in integrals of the type

$$\int K\left( \frac{u - x + g(u, \varepsilon)}{h} \right) \varphi(u)\,du,$$

since $g$ depends on $u$.

Remark 3.5. Theorem 3.3 requires the conditions $\beta > 1$ and (8) to be fulfilled. However,
if one (or both) of these two conditions is not assumed, then it is not difficult to check
from the preceding proof, using the rough bound $|(17)| \le C / (\sqrt{n}\, h)$, that one can still
recover a rate of convergence given by optimization in $h$ of
$h^{2\beta} + 1/(n h^2) + 1/(J_n h)$. This leads to a rate in $J_n^{-2\beta/(2\beta+1)}$ (respectively
$n^{-\beta/(\beta+1)}$), for $J_n$ smaller (resp. larger) than $n^{(2\beta+1)/(2\beta+2)}$.



4 Simulations 

In this Section, we first present how the shift estimators studied in Section 2 can
be numerically implemented. The estimation method proposed here is interesting in its
own right, since it provides a numerically tractable semiparametric estimator of the
translation parameter and generalizes the penalization method proposed in [15]. Second, we
construct the nonparametric estimator of the density defined in Section 3.2 and illustrate its
behavior on both simulated data and real data. We point out that in the considered examples, we
deal with the case where $f^{[j]} = f$, which is often used in practice when individual effects
are only expressed through a warping effect of a main behaviour modeled by a (common)
unknown function $f$.

4.1 Numerical algorithm for shift estimation and extensions 

To compute explicitly $\hat\theta$ given by (3) for each curve, one has to choose an appropriate
filter $(h_k)$. According to Lemma 2.4, a good choice of weights should make the remainder
term $R_n[h, f^{[j]}]$ small. The authors in [7] determined a sequence $(h_k)$ (roughly, a
well-chosen sequence of Pinsker weights) such that the second-order term is optimal from the
minimax point of view. However, this choice depends on the regularity parameters of the
function $f^{[j]}$, which are not known in practice.

Here we use an adaptation of the penalization technique introduced in [15] to determine
an appropriate sequence $(h_k)$. However, contrary to that paper, where only projection
weights $h_k = \mathbb{1}_{|k| \le K}$ are considered, note that the criterion (3) enables the use of a
much broader variety of weights $(h_k)$, provided these satisfy conditions (C)-(T). As explained
below, the use of Pinsker weights, see (23), enables a smoothing in the criterion which
improves estimation with respect to [15].



We would also like to mention an alternative method based on a data-driven choice
of the filter, proposed in [8]. This method is very interesting in particular from the
theoretical point of view, since it achieves an optimal minimax second-order term and is
adaptive to the regularity of $f^{[j]}$. But the method is asymptotic in nature and, though
it performs well for not too complicated signals, our method seems more appropriate for
complicated signals $f^{[j]}$ (that is, with possibly many non-zero Fourier coefficients), as in
the laser vibrometry example below.



Consider the class of Pinsker-type weights, depending on the parameters $K$ and $\beta$,
defined by

$$h_k = \left( 1 - (k/K)^{\beta} \right)_+, \quad k > 0, \tag{23}$$

where $K$ is called the length of the sequence of weights.



Hence a sequence of weights is characterized by the pair $(\beta, K)$. To simplify, we fix the
value of $\beta$ and take $\beta = 3$. Thus the family of weights depends on the single parameter
$K$. For any filter sequence of length $K$, we define

$$\Lambda_K(\tau) = \sum_{k=1}^{K} h_k \left( \frac{1}{n} \sum_{i=1}^{n} \cos(2\pi k (t_i - \tau))\, Y_i \right)^2, \tag{24}$$

where $Y_i$, $i = 1, \dots, n$, is the data corresponding to one curve in model (1). To make
the estimator feasible, we take the values of $\tau$ in a fixed regular grid of mesh $1/m$:
$\{\tau_1, \dots, \tau_{i+1} = \tau_1 + i/m, \dots, \tau_{\max}\}$, of range less than 1/2 (let us recall that the
diameter of $\Theta$ has to be bounded above by 1/2). Let us define

$$\hat\tau(K) = \arg\max_{\tau} \Lambda_K(\tau), \qquad M(K) = \Lambda_K(\hat\tau(K)).$$
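
The quantities $\hat\tau(K)$ and $M(K)$ can be computed for all filter lengths in one pass. The sketch below assumes Pinsker-type weights of the form $(1 - (k/K)^\beta)_+$; with this form the last weight $h_K$ vanishes, so the case $K = 1$ is degenerate. The test signal, noise level and grid are hypothetical.

```python
import numpy as np

def pinsker_weights(K, beta=3.0):
    """Weights h_k = (1 - (k / K)^beta)_+ for k = 1, ..., K (assumed form of the filter)."""
    k = np.arange(1, K + 1)
    return np.clip(1 - (k / K) ** beta, 0, None)

def criterion_path(y, tau_grid, K_max, beta=3.0):
    """For each K <= K_max, return the grid maximiser of the criterion and its maximum M(K)."""
    n = len(y)
    t = np.arange(1, n + 1) / n
    k = np.arange(1, K_max + 1)
    a = np.cos(2 * np.pi * np.outer(k, t)) @ y / n
    b = np.sin(2 * np.pi * np.outer(k, t)) @ y / n
    ang = 2 * np.pi * np.outer(tau_grid, k)
    sq = (np.cos(ang) * a + np.sin(ang) * b) ** 2   # squared projections, shape (n_tau, K_max)
    tau_hat, M = [], []
    for K in range(1, K_max + 1):
        lam = sq[:, :K] @ pinsker_weights(K, beta)  # criterion on the grid for this K
        tau_hat.append(tau_grid[np.argmax(lam)])
        M.append(lam.max())
    return np.array(tau_hat), np.array(M)

rng = np.random.default_rng(3)
n, theta = 400, 0.1
t = np.arange(1, n + 1) / n
y = np.cos(2 * np.pi * (t - theta)) + 0.5 * np.cos(4 * np.pi * (t - theta)) \
    + 0.3 * rng.standard_normal(n)                  # hypothetical curve plus noise
tau_hat, M = criterion_path(y, np.linspace(-0.24, 0.24, 481), K_max=10)
```

Since each weight increases with the length $K$, the maximum $M(K)$ is nondecreasing in $K$, which is why a penalty on $K$ is needed to select a filter length.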

Penalization. We would like to find an adapted sequence of weights, or equivalently an
integer $K$. This is done, as in [15] or [1], using a penalization method. Let

$$\hat K(\alpha) = \arg\min_{K} \ \{ -M(K) + \alpha K \}. \tag{25}$$

The parameter a should yield a trade-off between the fit with the data and the filter 
length K that can be viewed as the complexity of the chosen model. We use a data- 
driven method to find an appropriate a. The idea is to detect the changes in the convex 
hull of the function K — —M{K). Let us recall the following lemma from 



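Lemma 4.1. There exist two sequences $K_1 = 1 < K_2 < \cdots$, and
$\alpha_0 = +\infty > \alpha_1 > \cdots$, with

$$\alpha_p = \max_{K_p < K \le K_{\max}} \frac{M(K_p) - M(K)}{K_p - K} = \frac{M(K_{p+1}) - M(K_p)}{K_{p+1} - K_p},$$

and such that

$$\forall \alpha \in [\alpha_p, \alpha_{p-1}[, \quad \hat K(\alpha) = K_p.$$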
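
The sequences of the lemma can be computed by repeatedly taking the largest slope from the current breakpoint. The sketch below is one way to do so; the function names and the toy values of $M(K)$ are hypothetical, and the selected $K$ is taken as the minimizer of $-M(K) + \alpha K$.

```python
import numpy as np

def hull_breakpoints(M):
    """Sequences (K_p), (alpha_p) of the lemma for M[0] = M(1), ..., M[-1] = M(K_max)."""
    Ks, alphas = [1], []
    while Ks[-1] < len(M):
        Kp = Ks[-1]
        cand = np.arange(Kp + 1, len(M) + 1)
        slopes = (M[cand - 1] - M[Kp - 1]) / (cand - Kp)
        best = slopes.max()
        # Take the largest maximiser, so that collinear intermediate points are skipped.
        Ks.append(int(cand[np.flatnonzero(np.isclose(slopes, best))[-1]]))
        alphas.append(best)
    return Ks, alphas

def K_of_alpha(M, alpha):
    """Minimiser over K of -M(K) + alpha * K (smallest K in case of ties)."""
    K = np.arange(1, len(M) + 1)
    return int(K[np.argmin(-M + alpha * K)])

M = np.array([0.0, 3.0, 4.0, 6.0, 6.5])            # toy values of M(1), ..., M(5)
Ks, alphas = hull_breakpoints(M)
```

On this toy example, the selected length jumps from one breakpoint to the next exactly when the penalty level crosses one of the slopes in `alphas`.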
Note that connecting the points $(K_i, -M(K_i))$ gives the convex hull of the function
$K \mapsto -M(K)$ over the points $K_1, \dots, K_{\max}$. Let us define our estimator of the period as



p : T{Kp)=Ti 

In words, the estimator chooses the point at which the cumulated jump in the derivative 
is the highest. Note that different values a* of a led to such an estimator, which satisfies 
the identity r* = f{k{a*)). 
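The selection rule can be sketched as follows, taking as input the sequences $M(K)$ and $\hat\tau(K)$ for $K = 1, \dots, K_{\max}$. This is only an illustration under stated assumptions: the vertices $K_p$ are computed as the lower convex hull of the points $(K, -M(K))$, the term $p = 1$ is left out of the sum because $\alpha_0 = +\infty$, the slope after the last vertex is set to 0, and negative slopes are clipped at 0. None of these conventions is stated in the text, and the names `hull_breakpoints` and `select_shift` are hypothetical.

```python
import numpy as np

def hull_breakpoints(M):
    """Vertices K_p and slopes alpha_p of the lower convex hull of K -> -M(K)."""
    K = np.arange(1, len(M) + 1)
    y = -np.asarray(M, dtype=float)
    hull = []                                   # indices of the hull vertices
    for i in range(len(K)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            # drop b when it lies on or above the chord joining a and i
            if (y[b] - y[a]) * (K[i] - K[a]) >= (y[i] - y[a]) * (K[b] - K[a]):
                hull.pop()
            else:
                break
        hull.append(i)
    Kp = K[hull]
    # alpha_p = (M(K_{p+1}) - M(K_p)) / (K_{p+1} - K_p), decreasing in p
    alpha = -np.diff(y[hull]) / np.diff(Kp)
    return Kp, alpha

def select_shift(M, tau_hat):
    """Grid value with the largest cumulated jump alpha_{p-1} - alpha_p."""
    tau_hat = np.asarray(tau_hat)
    Kp, alpha = hull_breakpoints(M)
    if len(Kp) < 2:                             # single vertex: nothing to compare
        return tau_hat[0]
    # assumed conventions: clip negative slopes, slope 0 after the last vertex
    alpha = np.append(np.clip(alpha, 0.0, None), 0.0)
    jumps = alpha[:-1] - alpha[1:]              # jumps for p = 2..P
    cand = tau_hat[Kp[1:] - 1]                  # tau_hat(K_p), p = 2..P
    values = np.unique(cand)
    score = np.array([jumps[cand == v].sum() for v in values])
    return values[int(np.argmax(score))]
```

The comparison `cand == v` relies on all $\hat\tau(K)$ being taken from the same grid array, so that equal grid points are equal as floating-point numbers.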

Illustration of the algorithm. Figure 1 illustrates this algorithm with $f$ equal to $f_1(x) = 0.015 \cos\{100 \cos(\pi(x - \tau))\}$. The first graph represents the criterion $-M(K)$ together with, in dotted line, its convex hull. The stem diagram represents the differences $\alpha_{p-1} - \alpha_p$ corresponding to the points $K_p$. Finally, the estimated shifts for the different values of $K$ are represented by the last graph. The numerical parameters are the following: $n = 800$, and the true shift parameter is $\tau = 0.35$. The grid for $\tau$ is the regular grid of $[0.25, 0.75]$ with 100 points. Note that, due to the high level of the noise (small amplitude of the signal with respect to the noise variance), it is difficult to detect visually the changes in the criterion behavior. Nevertheless the algorithm succeeds in finding the true shift.

The number of significant harmonics of $f_1$ is roughly 100, thus if we knew $f_1$, taking $K$ of the order of 100 would be a reasonable choice in view of ([S]). In fact, for $k$ much larger than 100, the corresponding elements in the sum of squares are mainly noise. As we see in Figure 1, with the choice of $\tau$ given in (26), our algorithm chooses $K$ in the appropriate interval.

Compared to the method used in [15], the use of the Pinsker-type weights (23) allows a smoothing with respect to projection weights, making the detection of the main jumps in the convex hull of $K \mapsto -M(K)$ less sensitive to local irregularities of $K \mapsto M(K)$, which slightly improves estimation. An extensive numerical comparison of the two types of weights, which is beyond the scope of this paper, is carried out in [3] in the case of the period model for discrete design. Gains of 10 to 20% in the estimation of $\theta$ are observed for a laser vibrometry example with the unknown $f$ similar to the function $f_1$ above.

The period model. We note that this algorithm can also be implemented for the period model (9) and, more generally, whenever the penalized profile likelihood is known in closed form. For the period model, the algorithm follows the description above, once one replaces the cosine in (24) by its equivalent $\cos(2\pi k(t_i/\tau))$. A numerical study is carried out in [3], leading to conclusions similar to those presented here.

4.2 Numerical algorithm for density estimation 

Once the estimators of the realizations $\theta_j$, $j = 1, \dots, J$, are obtained, we can build the estimator of the density $\varphi$ defined above. We illustrate the good behaviour of our algorithm with three examples. The first one shows that important features of the target density, such as bimodality, can be detected with our method. The second example shows that even with quite involved functions, for which the semiparametric step is not easy, the method performs well, at least if the signal to noise ratio is not too small. The third example deals with a practical application where symmetry can be seen as a sensible assumption.



Simulated data (I). The function $f$ is the sine function on a half period, while the law of the shift $\theta$ is a bimodal mixture of compactly supported densities. We perform 50 random translations of the original curve with $n = 100$ observations per curve. In Figure 2 we present the observed curves in model (1). To study the performance of the estimator described in this paper, we first computed the preliminary estimates $\hat\theta_j$ obtained by the semiparametric method of Section 2 using the practical algorithm of Section 4.1. This set of values was then used to build two nonparametric estimators of the density, denoted respectively SPGaussian and SPepanech, using this estimator with, respectively, a Gaussian kernel and an Epanechnikov kernel. The smoothing parameters are chosen by cross-validation. We compare their performance with an estimate constructed in the following way. Using the algorithm described in [21], applied in the shape invariant model and following the lines of Section 3.3 in [24], we compute nonparametrically the values of the warping parameter, used to align the curves to the true shape. Then, using an Epanechnikov kernel, we build the corresponding density estimator, denoted by NPplug.
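The plug-in step can be sketched as follows, assuming the estimator has the usual kernel form $\hat\varphi(x) = (Jh)^{-1}\sum_j K((x - \hat\theta_j)/h)$. The bandwidth is passed as a fixed argument, so the cross-validation step mentioned above is not reproduced, and the function names are hypothetical.

```python
import numpy as np

def gaussian(u):
    """Standard Gaussian kernel."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)

def epanechnikov(u):
    """Epanechnikov kernel 0.75 (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * np.clip(1.0 - u ** 2, 0.0, None)

def plug_in_density(theta_hat, x, bandwidth, kernel=epanechnikov):
    """Kernel density estimate built on the preliminary shift estimates."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # u[a, j] = (x_a - theta_hat_j) / h
    u = (x[:, None] - theta_hat[None, :]) / bandwidth
    return kernel(u).mean(axis=1) / bandwidth
```

Passing `kernel=gaussian` or `kernel=epanechnikov` mirrors the two variants compared in the text, with the estimated shifts as the only data-dependent input.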

Figure 3 compares the preceding estimators. Visually, the estimators SPGaussian and SPepanech detect the density shape and bimodality, and SPepanech matches the density amplitude slightly better. The nonparametric-based kernel estimator NPplug catches the global shape of the bimodal density but is too rough, since the method it relies on is far too general compared with the semiparametric method designed to handle this particular situation. Hence, plugging a semiparametric preliminary estimate into a kernel-type estimator leads to a tractable estimator of the density of the shifts without knowledge of the shape of the warped function.




Figure 2: Simulated shifted curves.

Figure 3: Estimators of the shifts density.



Simulated data (II). We consider a function similar to the one introduced in the preceding subsection: $f(x) = \cos(100\cos(\pi(x - 0.35)))$. Such functions appear for example in laser vibrometry and are studied in [15] and [1] for the period estimation problem. In this case, the semiparametric estimation step is crucial since the data are fuzzy. In Figure 4 we represent the original curve (left picture) and both the true shift density $\varphi$ and the estimated density, plotted in dotted line (right picture). The curves have been shifted using a compactly supported smooth density $\varphi$, with $n = 100$ observations and $J_n = 30$. The two functions, the true function $\varphi$ and the estimate, are still visually relatively close.

Figure 4: Simulated laser vibrometry-type functions

Real data. We present in Figure 5 an estimation conducted on real data. These data, provided by ACI-NIM MIST-R (http://www.lsp.ups-tlse.fr/Fp/Loubes/ACI.html), are daily velocities of vehicles on a motorway in the suburbs of Paris.

After a preliminary classification which aims at building groups of homogeneous curves, we obtain several functional sets, each one representing a particular daily behaviour, as pointed out in [17]. For one group we get curves starting and ending at the maximal speed, while presenting some typical patterns which stand for a standard traffic-jam feature, repeated mornings and afternoons. Due to classification, the different features have been split into different classes, as pointed out in [11]. Hence the curves present some symmetrical aspect, but the starting hours of these traffic jams change slightly around a mean time, starting sooner or later each day. Hence, the shift model can be used here, as done also in [11].

In this study, we get a set of 32 curves with $n = 180$ observations, which corresponds to a velocity measured every 8 minutes during a day, see the left-hand side of Figure 5. Understanding road traffic behaviour involves first finding a mean pattern but also studying the density of the random shifts, in order to understand the reasons for these changes around the mean behaviour. The bimodal feature of the estimated density, see the right-hand side of Figure 5, can later be understood as the consequence of different weather conditions on the road network.

5 Appendix 

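First let us state a useful result about the control of Fourier coefficients by discrete approximations, which will be used in the sequel to control remainder terms. Let us denote

$$\bar f_k = \frac{1}{n}\sum_{i=1}^{n} \sqrt{2}\cos(2\pi k(t_i - \theta))\, f(t_i - \theta), \tag{27}$$

$$\bar g_k = \frac{1}{n}\sum_{i=1}^{n} \sqrt{2}\sin(2\pi k(t_i - \theta))\, f(t_i - \theta). \tag{28}$$

Figure 5: Real data: velocities curves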



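Lemma 5.1. For any $f$ in the class $F = F(\rho, C_0)$ satisfying (A1)-(A3), there exists a constant $C$ depending only on $C_0$ such that, for any $k \ge 1$,

$$|\bar f_k - f_k| \le C\left(\frac{k}{n} \wedge 1\right) \quad\text{and}\quad |\bar g_k| \le C\left(\frac{k}{n} \wedge 1\right), \tag{29}$$

where $a \wedge b$ denotes the minimum of the two reals $a$ and $b$.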



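Proof. For any continuously differentiable function $\varphi$ on the interval $[0, 1]$ and $t_i = i/n$, it holds

$$\frac{1}{n}\sum_{i=1}^{n} \varphi(t_i) - \int_0^1 \varphi(u)\,du = \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} (\varphi(t_i) - \varphi(u))\,du.$$

If $\|\varphi'\|_\infty$ denotes the supremum norm of the derivative of $\varphi$ on $[0, 1]$, we have

$$\left|\frac{1}{n}\sum_{i=1}^{n} \varphi(t_i) - \int_0^1 \varphi(u)\,du\right| \le \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} \|\varphi'\|_\infty |t_i - u|\,du \le \|\varphi'\|_\infty \sum_{i=1}^{n} \frac{(t_i - t_{i-1})^2}{2} = \frac{\|\varphi'\|_\infty}{2n}.$$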



Now let us apply the preceding to the functions $\varphi_1(u) = \cos(2\pi k(u - \theta)) f(u - \theta)$ and $\varphi_2(u) = \sin(2\pi k(u - \theta)) f(u - \theta)$ respectively. By symmetry and 1-periodicity of $f$, we have $\int_0^1 \varphi_2(u)\,du = 0$. For any real $u$,

$$|\varphi_1'(u)| \le 2\pi k \|f\|_\infty + \|f'\|_\infty,$$

and, similarly, the same bound holds for $|\varphi_2'(u)|$. Now observe that $\|f\|_\infty$ and $\|f'\|_\infty$ are bounded if $f$ belongs to the class $F$. Indeed, if $f \in F$, then $f'$ is continuously differentiable and 1-periodic, thus it is the limit of its Fourier series. For any $u \in [0, 1]$, we have

$$f'(u) = \sum_{k \ge 1} (-2\pi k) f_k \sin(2\pi k u).$$

To see that the latter quantity is bounded it suffices to check that the series $\sum_k k |f_k|$ converges. This is a consequence of the Cauchy-Schwarz inequality and (A3) since



1/2 



k- 
is bounded. Similarly, $\|f\|_\infty$ is bounded by $\sum_{k \ge 1} |f_k|$, which is bounded if $f$ is in $F$. This implies that $\bar f_k - f_k$ and $\bar g_k$ are bounded by a constant times $k/n$. The fact that they are also bounded by a constant follows from the fact that $\|f\|_\infty$ is bounded over $F$. □
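The Riemann-sum bound $\|\varphi'\|_\infty/(2n)$ at the start of the proof can be checked numerically on a simple case. The sketch below uses the illustrative choice $\varphi(u) = u^2$, which is not taken from the paper, for which $\int_0^1 \varphi = 1/3$ and $\|\varphi'\|_\infty = 2$; the helper name `riemann_gap` is hypothetical.

```python
import numpy as np

def riemann_gap(phi, integral, n):
    """|(1/n) sum_{i=1}^n phi(i/n) - integral| for the design t_i = i/n."""
    t = np.arange(1, n + 1) / n
    return abs(phi(t).mean() - integral)

for n in (10, 100, 1000):
    gap = riemann_gap(lambda u: u ** 2, 1.0 / 3.0, n)
    bound = 2.0 / (2 * n)          # sup |phi'| / (2n) with sup |phi'| = 2
    print(n, gap, bound)
```

For this choice the gap equals $1/(2n) + 1/(6n^2)$, which stays below the bound $1/n$ and decays at the rate $1/n$ used in the lemma.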

Lemma 5.2. Assume that conditions (A1)-(A3) and (C1)-(C3) are fulfilled. Then for 
some positive constant C , denoting = X]fc>i(27rfc)^/ifc, 



h'\ 



Thlk'^^Cn', Thkeifkl^C-^ and y^hkk''\fk{fk-fk)\^C 
^-^ ^-^ log n ^-^ 



k>l 



k>l 



\h'\ 



n 



Proof of Lemma \5.2[ Note that for any integer k, we have hlk^ ^ kkk^maxk^i hkk"^. The 
latter maximum is smaller than the corresponding sum over k which, due to (C3), is at 
most Diu. Using (C3) again, one obtains the first inequality. Then due to (C2), 



^hkk'^lfkl ^ ( maxhkk ) ^k\fk\ ^ C 



\h'\ 



log^ n ' 



which yields the second inequality. Finally, using fl^^ and the Cauchy-Schwarz inequality. 



1/2 



which yields the third inequality using (A3). 



□ 



We can now turn to the proofs of Lemmas 2.3, 2.4 and 2.5. We shall work conditionally to the event $\theta_j = \theta$. To abbreviate the notation, we omit the index $j$ and drop the conditioning notation $(\cdot \mid \theta_j)$. Thus the expectations in the following should be understood at fixed $\theta_j$.

The main novelty with respect to [7] consists in proving that the arguments used by the authors in that paper can be adapted to the discrete setting, by showing that the arguments still go through when working with the discrete approximations $\bar f_k$ and $\bar g_k$ instead of $f_k$ and 0 respectively.

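Proof of Lemma 2.3. The contrast function to maximize is, according to (3),

$$L(\tau) = \sum_{k \ge 1} h_k \left( \frac{1}{n}\sum_{i=1}^{n} \sqrt{2}\cos(2\pi k(t_i - \tau))\, Y_i \right)^2 = \sum_{k \ge 1} h_k \left( \cos(2\pi k(\theta - \tau))\bar f_k - \sin(2\pi k(\theta - \tau))\bar g_k + \frac{1}{\sqrt n}\left[\cos(2\pi k\tau)\xi_k + \sin(2\pi k\tau)\xi_k^*\right] \right)^2, \tag{30}$$

where $\bar f_k$ and $\bar g_k$ are defined by (27)-(28). The criterion $L(\tau)$ is the sum of three terms

$$L(\tau) = \eta_0(\tau) + \frac{2}{\sqrt n}\|f'\|\,\eta_1(\tau) + \frac{1}{n}\eta_2(\tau),$$

where

$$\eta_0(\tau) = \sum_{k \ge 1} h_k \left[\cos(2\pi k(\theta - \tau))\bar f_k - \sin(2\pi k(\theta - \tau))\bar g_k\right]^2,$$

$$\eta_1(\tau) = \|f'\|^{-1} \sum_{k \ge 1} h_k \left[\cos(2\pi k(\theta - \tau))\bar f_k - \sin(2\pi k(\theta - \tau))\bar g_k\right]\left[\cos(2\pi k\tau)\xi_k + \sin(2\pi k\tau)\xi_k^*\right],$$

$$\eta_2(\tau) = \sum_{k \ge 1} h_k \left[\cos(2\pi k\tau)\xi_k + \sin(2\pi k\tau)\xi_k^*\right]^2.$$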

Note that this is the analog of the decomposition of [7], p. 185, except that here the quantity $\cos(2\pi k(\theta - \tau))\bar f_k - \sin(2\pi k(\theta - \tau))\bar g_k$ replaces $\cos(2\pi k(\theta - \tau)) f_k$. Let us see how the argument is further modified.

The stochastic term ri2 is exactly the same as in [7]. The term rji is such that its 
derivative ri[ is a zero-mean stationary Gaussian process and one has 

EK(r)^) = 411/11-2 5^ /.^(2vrfcr(^ + ?^), 

So, ?7i(t) has a variance bounded from below by a constant times /f which is bounded 
away from zero for n large enough due to ([21]) and (A3). Moreover, the variance of ri'( 
is bounded. Hence one can apply the Rice formula as in [7] to obtain that there are some positive constants $C$ and $D$ such that, for all $x > 0$,

$$P\Big(\sup_{\tau \in \Theta} |\eta_1'(\tau)| > x\Big) \le C\exp(-Dx^2), \tag{31}$$

which is the result obtained in [7]. Finally we deal with $\eta_0$ by writing



/ifc [cos2(27rA;(r - 6))/^ - 2 cos(27rA;(r - 6)) sin(27rA;(r - 9))fkgk 



+ sm\2nk{T-9))gl 



We have that 7o(6') = and 7"(^) ^ —{2TiYfl is bounded away from zero due to ( !29l) 
and (A3). Thus, similarly to [7j, one has 7o(t) — 7o(^^) ^ ~C\t — 9\'^ for all r G 0. Now 
note that for all real r, 



l7i(r)+72(r)| ^5^2^^fc + 



Hence using (1291) . the sum 7i(t) +72(r) is a 0(l/n) uniformly in r. The argument is now 
completed as follows. Using the obtained bounds, for any positive x, 



P9(\9-9\^Mff>x) ^Pe( sup {L{t) - L{9)) ^ 



^ Pfl I sup 

^V^||/'|||r-e|>x 



^o(r) - r^o(^) + 2^(r7i(r) - + -(^2(r) - 

n n 



> 



^P, I sup r]o{T) - r]o{9) + \t - 9\snp <2 '' '"'/l'"" ^ i 
,xA^||/'l||r-e|>x tee I ^ J 



^ Pfl I sup 

^v^||/'|||r-e|>x 



r - 9\^ \t - 9\ tee I V'^ n 
Now setting $x = x_n = K(\log n)^{1/2}$, for some positive constants $C_1$ and $C_2$ we have

P»(|9-9|V"ll/'f >^n^ 



V tee I ^ J Xn 

^ Pe (sup |r/;(t)| ^ Ca;„ ) + (sup \r]'^it)\ ^ C/ExA . 

This is further bounded using (31) for the first term and Lemma 3 in [7] for the second term, which concludes the proof. □

Before we turn to the proof of Lemma 2.4, we state a lemma which summarizes the properties of the criterion $L$, see Equation (30), and its derivatives. Note that it is in particular a natural adaptation of Lemma 6 in [7].

Lemma 5.3. Uniformly over 9 E Q and f E F, as n +00, 

nL'{ef) = ^Y.^27:kfhlfl + {l + o{l))^^^^ (32) 



E(L"(^)) = -2Y,hu{27^kf fl + o 



Moreover, uniformly over 9 E Q and f E F, as n ^ +00, 



(33) 



B{L"{9) -B{L"{9))Y = 0{n-') and B ( snp L^^\Cf^ = 0{l). (34) 

Vcee / 

Proof of Lemma 15.51 Let us denote 

^ n 1 " 

^J9) = - V v^cos(27rfc(ti - 9))ei and Cki^) = - V ^2 sin(27rA;(ti - 9))ei. 



n ^ — ' n 

1=1 1=1 



Simple calculations from (1301) lead to 

L'i9) = 2j2hki2nk)iK + n~'/'U9))ig, + n-'/'Ckm (35) 

L"(e) = 2Y,h{2nkf{-{f, + n-'/%{9)f + {g, + n-'/'Ckm'} (36) 



From (l35l) we deduce that 



E(LW) = Aj2hU2nkr(^ + ^) 

k^l ^ ^ 

+4 V hlC2i,kf (hg,' + -{2f,iK - /,) + ih - f,)'} + ^) 



; ^ hkhi{2Txk){2Txl)Jkfi9k9i- 

k^l 
The second term in the last display is bounded using first (29) and then Lemma 5.2 by $O(\|h'\|/n^2)$, which is a $o(\|h'\|^2/n^2)$ since, due to (C2), the norm $\|h'\|$ tends to $+\infty$. To bound the third term, note that it is bounded above by the square of



k>l k>l ^ ^ ^ 



\h'\ 



log n 



using Lemma \^7I[ The corresponding square is thus a odl/i'lp/n^). For the second deriva- 
tive, from fl36l) we deduce that 



EiL"i9)) = 2j2hk{2nk)\-fk 
16 



gk 



(37) 



1 R ^ n 1 

E{L"i9)-BiL"i9))r = -J2hli2nknfk +g-k' + -: 

n ^-^ n 



k>l 



and proceeding as for using (^Uj) and Lemma \^?2\ one obtains ( IHHj) and the first part 
of (EH). Finally, the result about L^^^ is obtained as follows. Proceeding as in Lemma 6 
in [7j, one easily sees that 

sup |L(3)(C)| ^CJ2 hkk' f + gk' + + Ck')) ■ 
The deterministic part of the last display is bounded by a constant times 

hkk' {2f, + 2{fk - fkf + g-k') <cJ2hk (k'f, + e^) ^ c. 



To obtain the first inequality, using (l29l) we have bounded one fk^ fk and one gihy Ck/n 
and the other ones by a constant. The second inequality is obtained using (A3) and (C3), 
which concludes the proof of the Lemma. □ 

Note that all the dependence on $\bar f_k$ and $\bar g_k$ has vanished in Lemma 5.3, replaced by results expressed in terms of $f_k$ only. In fact, the results of this lemma are exactly the ones used in [7] to prove the second order expansion, so with this observation there is nothing left to prove to obtain Lemma 2.4. We include however the end of the proof for completeness.



Proof of Lemma \2.4\ As in [7], the proof is in two steps. The first step is to prove that r 
defined by the following relation has the desired second order expansion, 

L'(0) + (f-0)E(L"(0)) = 0. (38) 

Let us evaluate ^{{9 - 9fln{f)), where 4(/) = n\\f'f. From ([32]) and ([33]) it follows 



Y,{{T-9fW)) 



k>l 



i+\\f'\rY.(h-i){2nkrf',+o{\\h'r/n) 
Let us expand the square in the denominator and then use a Taylor expansion of the function $x \mapsto (1 + x)^{-1}$. Then, one computes the product with the numerator and uses assumption (T). One obtains

mr-efW)) = 1 + (1 + o{l))\\fr'R^{K /), 

which is the desired expansion for r. 

In a second step, we prove that 6 and r are close enough. It is sufficient to do this 
on the set Ai = {\9 — 9\ ^ D(n~^ logn)^/^} since the probability of its complement is 
negligible due to Lemma [2731 By definition of ^, we have L'{9) = 0. By Taylor's expansion. 



= L'{e) = L\e) + {e- e)L"{e) + ^ 

for some random variable C, which can also be written 

= L'{e) + {e -e)^{L"{e)) 
+{o-e)[L"{e)-^{L'\e))] + 

Subtracting (IH^ and (IH^ . one obtains 



-L(3)(C). 



(39) 



E 



rflA^{L'\e)f 



^ 2E 



^2{L"(e)-E(L"(^))}2u)+E 



e)Sup|L(3)(C)PU 
Cee 



□ 



Using (IMll and the definition of ^i, one obtains E((^ -f)2/„(/)l^J ^ Cn'Mog^ n which 
is a o{Rn{h, /)). Finally, by similar arguments, one also sees that E((6' — r)(r — is 
a o{Rn{h, /)) which concludes the proof. 

Proof of Lemma 12. 5t 

Proof. Starting from ( l39l) . using the triangle and Cauchy-Schwarz inequalities, 

E{L"{e))\ ^ \E{L'{e))\ 

+{e(9- eff/^ {E[L"{e) - E{L'\e))fY'^ 



E 



+ lE|(^-e)2sup|L(^)(C)||. 



The first term on the right-hand side of the last display can be bounded using ([29 



E{L'{e)) 



k9k 



k>l 



k>l ^ ^ 



which is a Oin ^) due to the assumption on / and (C3). To bound the second term, we 
use f[M|) and the fact that the result of Lemma 12.41 implies 



23 



We note that the latter relation could also be checked directly, similarly to the proof of 
Lemma 12.41 but without keeping second order terms. To bound the third term, we use 
Cauchy-Schwarz inequality 



1 /2 

E |(^- 9f sup |L(^)(C)|} ^ E ((^- 9)'^' E jsup \L^'\0\'^ . 

Now with Ai = {\9- 9\ ^ L)(n"Mogn)^/2 Lemma EBl 

E{i9-9r) = Eii9-9yu,)+Ei(9-9yu.) 

^ C(logn/n)^ + Cexp(-dD^logn). 

Choosing D large enough, we obtain that the third term is a 0(log?T,/?7,). The fact that 
|E(L"(0))| is bounded from above and below by positive constants, which follows from 
(IH7j) and (A3)-(C3), yields the announced result. □ 



Acknowledgments. The authors are grateful to Professor Alexandre Tsybakov for an insightful remark concerning this work. We also thank a referee whose comments led us to correct a mistake in a previous version of this work.

References 

[1] D. Bates and M. Lindstrom. Nonlinear mixed effects models for repeated measures data. Biometrics, 46(3):673-687, 1990.

[2] L. Brumback and M. Lindstrom. Self modeling with flexible, random time transformations. Biometrics, 60(2):461-470, 2004.

[3] I. Castillo. Penalized profile likelihood methods and second order properties in semiparametrics. PhD thesis, Université Paris-Sud, 2006.

[4] I. Castillo. Semi-parametric second-order efficient estimation of the period of a signal. Bernoulli, 13(4):910-932, 2007.

[5] L. Cavalier, G.K. Golubev, D. Picard, and A.B. Tsybakov. Oracle inequalities for inverse problems. Ann. Stat., 30(3):843-874, 2002.

[6] D. Chafai and J-M. Loubes. Maximum likelihood for a certain class of inverse problems: an application to pharmacokinetics. SPL, 76:1225-1237, 2006.

[7] A. S. Dalalyan, G. K. Golubev, and A. B. Tsybakov. Penalized Maximum Likelihood and Semiparametric Second Order Efficiency. Ann. Stat., 34(1):169-201, 2006.

[8] A. S. Dalalyan. Stein shrinkage and second-order efficiency for semiparametric estimation of the shift. Math. Methods Statist., 16(1):42-62, 2007.

[9] S. Darolles, J-P. Florens, and E. Renault. Nonparametric instrumental regression. To appear in Econometrica, 2005.

[10] M. Davidian and D. Giltinan. Nonlinear Models for Repeated Measurement Data: An Overview and Update. Journal of Agricultural, Biological, and Environmental Statistics, 8:387-419, 2003.



[11] F. Gamboa, J-M. Loubes, and E. Maza. Shifts estimation with M-estimators. Electronic Journal of Statistics, 616-640, 2007.

[12] T. Gasser and A. Kneip. Searching for structure in curve samples. J. Amer. Statist. Assoc., 90:1179-1188, 1995.

[13] D. Gervini and T. Gasser. Self-modelling warping functions. J. R. Stat. Soc. Ser. B Stat. Methodol., 66(4):959-971, 2004.

[14] A. Kneip and T. Gasser. Statistical tools to analyze data representing a sample of curves. Ann. Statist., 20(3):1266-1305, 1992.

[15] M. Lavielle and C. Levy-Leduc. Semiparametric estimation of the frequency of unknown periodic functions and its application to laser vibrometry signals. IEEE Transactions on Signal Processing, 53(7):2306-2315, 2005.

[16] B. Lindsay. The geometry of mixture likelihoods: a general theory. Ann. Statist., 11(1):86-94, 1983.

[17] J-M. Loubes, E. Maza, M. Lavielle, and L. Rodriguez. Road trafficking description and short term travel time forecasting, with a classification method. Canad. J. Statist., 34(3):475-491, 2006.

[18] F. Mentre and A. Mallet. Handling covariates in population pharmacokinetics. Int. J. Biomed. Comp., 36:25-33, 1994.

[19] J. Ramsay and X. Li. Curve registration. J. R. Stat. Soc. Ser. B Stat. Methodol., 60(2):351-363, 1998.

[20] B. Rønn. Nonparametric maximum likelihood estimation for shifted curves. J. R. Stat. Soc. Ser. B, 69(2):243-259, 2001.

[21] A. B. Tsybakov. Introduction à l'estimation non-paramétrique. (Introduction to nonparametric estimation). Mathématiques & Applications (Paris), 41. Paris: Springer, 2004.

[22] A.W. van der Vaart. Asymptotic statistics. Cambridge University Press, 1998.

[23] M. Vimond. Asymptotic efficiency of shifts estimators with M-estimators. To appear in Annals of Stat., 2008.

[24] K. Wang and T. Gasser. Synchronizing sample curves nonparametrically. Ann. Statist., 27(2):439-460, 1999.



